System and method for network caching

ABSTRACT

A system and method for caching network resources in an intermediary server topologically located between a client and a server in a network. The intermediary server preferably caches at both a back-end location and a front-end location. The intermediary server includes a cache and methods for loading content into the cache according to rules specified by a site owner. Optionally, content can be proactively loaded into the cache to include content not yet requested. In another option, requests can be held at the cache when a prior request for similar content is pending.

RELATED APPLICATIONS

[0001] The present invention claims priority from U.S. Provisional Patent Application No. 60/197,490 entitled CONDUCTOR GATEWAY filed on Apr. 17, 2000.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention.

[0003] The present invention relates, in general, to network information access and, more particularly, to software, systems and methods for caching request and/or response traffic in a data communication system.

[0004] 2. Relevant Background

[0005] Increasingly, business data processing systems, entertainment systems, and personal communications systems are implemented by computers across networks that are interconnected by internetworks (e.g., the Internet). The Internet is rapidly emerging as the preferred system for distributing and exchanging data. Data exchanges support applications including electronic commerce, broadcast and multicast messaging, videoconferencing, gaming, and the like.

[0006] Currently, Internet services are implemented as client-server systems. The client is typically implemented as a web browser application executing on a network-connected workstation or personal computer, although mail, news, file transfer and other Internet services are relatively common. The server is typically implemented as a web server at a fixed network address. A client enters a uniform resource locator (URL) or selects a link pointing to a URL where the URL identifies the server and particular content from the server that is desired. The client request traverses the network to be received by the server.

[0007] The server then obtains data necessary to compose a response to the client request. For example, the response may comprise a hypertext markup language (HTML) document in a web-based application. HTML and other markup language documents comprise text, graphics, active components, as well as references to files and resources at other servers. In the case of static web pages, the web server may simply retrieve the page from a file system, and send it in an HTTP response packet using conventional TCP/IP protocols and interfaces. In the case of dynamically generated pages, the web server obtains data necessary to generate a responsive page, typically through one or more database accesses. The web server then generates a page, typically a markup language document, that incorporates the retrieved data. Once the page is generated, the web server sends the dynamic page in a manner similar to a static page.

[0008] Many web sites have pages or page elements that are viewed by many users, and appear the same to all viewers. These page elements include data in the form of text files, markup language documents, graphics files and the like, as well as active content such as scripts, applets, active controls and the like. It is an inefficient use of bandwidth and server resources to resend the exact same data time after time from the server to different users.

[0009] Caching is designed to store copies of pages or page elements in caches located closer to the clients that are requesting the pages. When a web page can be served out of a cache, the page is returned more quickly to the user. Caching reduces Internet traffic, for example, by storing pages at the Internet service provider (ISP) the first time a user accesses the pages so that subsequent requests for the same page do not require that the origin server be contacted. Another advantage of caching is that because fewer requests reach the origin server, the origin server load is reduced for a given number of users.

[0010] A key issue in systems that cache Internet resources is control over the caching behavior. Conventional Internet caching methods allow a site owner to indicate desired caching behavior by specifying cache control parameters in the HTTP packet headers. While HTTP v1.1 specifications, defined in IETF RFC 2068, allow cache parameters, many other network protocols do not offer these features. Also, a web site owner does not have actual control over the many caching mechanisms in the Internet and so cannot be certain that the desired caching behavior will in fact be implemented at each and every cache. As a result, many site owners simply specify all web pages as non-cacheable to minimize risks associated with delivery of stale content, thereby forgoing the benefits of caching. A need exists for a system and method that provides for caching under control of rules specified by site owners.

[0011] Another limitation of existing Internet caching mechanisms is that they lack effective means that enable a site owner to control the contents of caches. Most cache systems are passive response caches that will cache a copy of a response to a given request, but will not place content in a cache before a specific request. In some applications, however, performance can be improved if the web site could controllably load the cache by effectively pushing content from the web server out to the cache. For example, when a server is under heavy load, it may be desirable to expand the cache contents to explicitly prevent user requests from hitting the central server. Hence, a need exists for systems and methods enabling web site owners to actively manage network caches.

[0012] When the content being cached includes large files, it may take a significant amount of time to communicate the response and fill the cache. Examples include multimedia files, multicast or broadcast files, software updates, and the like. In these cases, the likelihood that a subsequent request for the same content will come in while the initial request is being filled is greater. Existing cache solutions will continue to forward requests to the server until the entire set of static files is cached so that the benefit of the cache is non-optimal. Particularly in situations where large files (e.g., streaming video, media, software updates, and the like) are involved, it is contemplated that substantially simultaneous requests for the same content will be frequent. Accordingly, a need exists for a cache system and method that performs request merging to postpone requests while a prior request for the same content is pending or “in-flight”.
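
By way of illustration only, the request merging described above can be sketched as follows. The class name `MergingCache`, the `fetch` callable standing in for the origin server, and the use of threads to model concurrent client requests are all assumptions made for this sketch and are not taken from the specification.

```python
import threading
from typing import Callable, Dict


class MergingCache:
    """Hypothetical cache that holds later requests while an earlier one is in flight."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                      # stand-in for a request to the origin server
        self._lock = threading.Lock()
        self._store: Dict[str, str] = {}         # completed responses
        self._in_flight: Dict[str, threading.Event] = {}

    def get(self, key: str) -> str:
        with self._lock:
            if key in self._store:
                return self._store[key]          # ordinary cache hit
            event = self._in_flight.get(key)
            leader = event is None
            if leader:
                # First request for this key: mark it as in flight.
                event = threading.Event()
                self._in_flight[key] = event
        if leader:
            try:
                value = self._fetch(key)
                with self._lock:
                    self._store[key] = value
            finally:
                # Release held requests whether or not the fetch succeeded.
                with self._lock:
                    del self._in_flight[key]
                event.set()
            return value
        # A prior request is pending: hold this request until it completes.
        event.wait()
        with self._lock:
            if key in self._store:
                return self._store[key]
        return self.get(key)                     # prior request failed; try again
```

In this sketch only the first request for a given key reaches the origin; requests that arrive while it is pending are answered from the cache once the fill completes.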

[0013] Existing cache solutions are asynchronous in that the cache mechanism is unaware of operational and timing details within the origin server that might affect cache performance. For example, a particular web page might be updated within the origin server every minute or every half-hour. In such cases, the freshest content is obtained just after the origin server update. However, because the cache does not know the update cycle, it cannot adjust the caching behavior to maximize the delivery of fresh content.

SUMMARY OF THE INVENTION

[0014] Briefly stated, the present invention involves a system for caching network resources in an intermediary server topologically located between a client and a server in a network. The intermediary server preferably caches at both a back-end location and a front-end location. The intermediary server includes a cache and methods for loading content into the cache according to rules specified by a site owner. Optionally, content can be proactively loaded into the cache to include content not yet requested. In another option, requests can be held at the cache when a prior request for similar content is pending.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] FIG. 1 illustrates a general distributed computing environment in which the present invention is implemented;

[0016] FIG. 2A shows in block-diagram form significant components of a system in accordance with the present invention;

[0017] FIG. 2B shows in block-diagram form significant components of an alternative system in accordance with the present invention;

[0018] FIG. 3 shows a domain name system used in an implementation of the present invention;

[0019] FIG. 4A shows components of FIG. 2A in greater detail;

[0020] FIG. 4B illustrates components of FIG. 2B in greater detail; and

[0021] FIG. 5 shows back-end components of FIG. 2B in greater detail.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0022] The present invention involves systems and methods for providing improved performance in network communications through the use of a caching web server located at the client-side “edge” of a network environment. This means that the web server implementing the cache in accordance with the present invention is logically proximate to the client application generating requests for web pages. It is contemplated that at least some of the requested content can be supplied by content data and resources persistently stored in a data store accessible by the web server, but that at least some content will be obtained from remote network resources such as remote data stores, remote web servers, and the like. This remote data is obtained by the web server on behalf of a requesting client and then cached so that it can be used to respond to subsequent requests. In a sense, the web server acts as a hybrid between a web server and a caching proxy server in that it can supply both original content as well as cached remote content.

[0023] The present invention is illustrated and described in terms of a distributed computing environment such as an enterprise computing system using public communication channels such as the Internet. However, an important feature of the present invention is that it is readily scaled upwardly and downwardly to meet the needs of a particular application. Accordingly, unless specified to the contrary, the present invention is applicable to significantly larger, more complex network environments, including wireless network environments, as well as small network environments such as conventional LAN systems.

[0024] Essentially, an intermediary server(s) is placed in communication with the client and server to participate in the request/response traffic between the client and server. In this position, the intermediary can be given specific knowledge of the configuration, capabilities and preferred formats of both the clients and servers. The intermediary implements a cache for storing content including web pages, web page components, files, graphics, program code and the like that are subject to client requests. Moreover, an intermediary server can modify the data that is subject to a client request and cache the modified data. For example, multimedia files may be formatted to a particular standard, such as an MPEG standard, and cached in the modified form such that subsequent requests can be filled from cache without requiring formatting processes.

[0025] Although the present invention may use conventional cache instructions provided in HTTP headers, preferably the content includes caching parameters that provide independent instructions to the caching system of the present invention. In this manner, the caches of the present invention may provide caching services even where the content is marked uncacheable in the HTTP header.
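
As a minimal sketch of this precedence, and not a definitive implementation, the following assumes that owner-supplied caching parameters take the form of URL path prefixes mapped to a time-to-live. The names `OwnerRule`, `header_ttl` and `effective_ttl`, and the prefix-matching scheme itself, are hypothetical; the specification does not prescribe a rule format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OwnerRule:
    """Hypothetical site-owner rule: URL path prefix mapped to a time-to-live."""
    path_prefix: str
    ttl_seconds: int


def header_ttl(cache_control: str) -> Optional[int]:
    """TTL implied by a Cache-Control header value, or None if it permits no caching."""
    directives = [d.strip().lower() for d in cache_control.split(",") if d.strip()]
    if any(d in ("no-store", "no-cache", "private") for d in directives):
        return None
    for d in directives:
        if d.startswith("max-age="):
            try:
                return int(d.split("=", 1)[1])
            except ValueError:
                return None
    return None


def effective_ttl(path: str, cache_control: str, rules: List[OwnerRule]) -> Optional[int]:
    """Owner rules, when present, override the HTTP header; otherwise the header applies."""
    matches = [r for r in rules if path.startswith(r.path_prefix)]
    if matches:
        # Assumption: the longest matching prefix is the most specific rule.
        return max(matches, key=lambda r: len(r.path_prefix)).ttl_seconds
    return header_ttl(cache_control)
```

Under these assumptions, a page marked `no-cache` in its HTTP header is still cached by the intermediary when an owner rule covers its path, and is left uncached otherwise.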

[0026] Preferably, the cache contents are determinable by the site owner by allowing the site owner to cause the cache to be loaded with desired content. In this manner, content can be loaded into a cache before it is requested by a user based on a likelihood that it will be requested by a user. In this manner, a web site owner can explicitly prevent hits to the central server(s) by proactively sending content to a cache. For example, in the case of a small web site of a few megabytes of storage, the first hit to an index page may justify sending the entire site to the intermediary server that received the hit thereby preventing any further hits to the origin server.

[0027] Moreover, the home page or “index.html” page of the web site (i.e., the page most likely to be the first contact with a user) may be cached on the intermediary permanently or until the website provides an updated index page. A hit to the index page may result in serving the index page from the cache, and a request from the intermediary server to load the remainder of the site, the linked pages, or an intermediate set of pages into the cache while the user views the index.html page.

[0028] In general, the present invention involves a combination of passive caching (i.e., cache contents are selected based on prior requests) and active caching (i.e., cache contents are selected speculatively based on what is expected to be requested in the future). Speculative caching is used in a variety of environments, such as disk caching, where data has a predictable spatial relationship with other data. However, in a network environment, particularly in a hyper-linked environment, the spatial relationships cannot be relied upon. HTML, for example, is specifically designed to access data in a non-sequential, user-controlled manner.

[0029] Accordingly, the present invention provides a system and method in which a current request, or a response to that request, can be used as a basis for both passively and actively caching network resources associated with that request/response. A response can be passively cached in a conventional manner. Additionally, however, the request and/or response can be analyzed to determine other material that can be actively cached, such as data linked to the response.

[0030] This additional data may include dynamically generated data. For example, a current request may include state information such as a cookie, that is intended to allow the origin server to personalize responses. The cache may be filled by making speculative requests to the server using the cookie so that the cache is filled with dynamically generated content. Alternatively, state information can be communicated from the server in the initial response in the form of parameters encoded in links within the response. The cache may be filled by making speculative requests using the links within the current response so that the cache is filled with dynamically generated content.
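
The link-based alternative can be illustrated with a short sketch, offered under stated assumptions rather than as the method of the invention. The response is passively cached, its anchor links are extracted with Python's standard `html.parser` module, and each linked resource (including any parameters encoded in the link) is speculatively requested. The names `LinkCollector` and `cache_with_prefetch`, the `fetch` callable, and the `max_prefetch` limit are hypothetical.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from anchor tags in a response body."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def cache_with_prefetch(url: str,
                        fetch: Callable[[str], str],
                        cache: Dict[str, str],
                        max_prefetch: int = 10) -> str:
    """Passively cache the response for url, then actively cache what it links to."""
    body = fetch(url)
    cache[url] = body                        # passive caching of the current response
    collector = LinkCollector()
    collector.feed(body)
    fetched = 0
    for href in collector.links:
        if fetched >= max_prefetch:
            break
        target = urljoin(url, href)          # keeps query parameters encoded in the link
        if target not in cache:
            cache[target] = fetch(target)    # speculative request on behalf of future clients
            fetched += 1
    return body
```

A cookie-based variant would pass the state information from the current request along with each speculative request; that is omitted here for brevity.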

[0031] In this manner, a cache can be intelligently and efficiently populated with data that has a high likelihood of being requested in the future even though that data may be stored in a variety of files and servers. Hence, the present invention provides a means to explicitly prevent accesses to the origin server by moving content into the cache speculatively, before it is requested.

[0032] Alternatively, speculative caching may be based on server resources. A server under load may become less able to provide certain types of resources. For example, an e-commerce web server under a high shopping load may dedicate a large quantity of resources (e.g., I/O buffers and memory) to shopping cart processes. This may make resources for other activities (e.g., serving informational pages) scarce. In accordance with the present invention, an intermediary server is aware of server load and speculatively caches data that might otherwise tax scarce resources in the server itself.

[0033] While the examples herein largely involve web-based applications using markup language documents, web browsers and web servers, the teachings are readily extended to other types of client-server information exchange. For example, a database system can respond to an initial query by loading the entire database or a selected portion of a database into the cache of an intermediary server. Subsequent requests can be satisfied by the intermediary server. Likewise, a network file system (NFS) or file transfer protocol (FTP) system may respond to an initial hit for a top level directory by loading the cache of an intermediary server with files from all the branches of the directory or with a list of details for the files in the immediate subdirectories. These implementations are readily derived equivalents of the examples given herein.

[0034] In a particular implementation, the intermediary server is implemented by a front-end computer and a back-end computer that are coupled over a network. This enables either or both of the front-end and back-end computers to perform the caching functions as needed. In this configuration, a back-end cache will cache responses to multiple front-ends. Conversely, the front-end caches will cache responses received from multiple back-ends. Hence, each front-end and back-end cache is uniquely situated within the network topology to improve cache performance. For example, a back-end cache can readily identify cache contents with high hit rates and propagate that content to all or selected front-end caches. This will result in requests being filled in the front-end caches before they reach the back-ends.
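
One possible way to picture this propagation is sketched below. The classes `FrontEndCache` and `BackEndCache`, the hit-count threshold used to decide that content has a "high hit rate", and the `fetch_from_origin` callable are assumptions for illustration only; the specification does not state how hit rates are measured.

```python
from collections import Counter
from typing import Callable, Dict, List, Set


class FrontEndCache:
    """Hypothetical front-end cache that accepts content pushed from a back-end."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self.store[key] = value


class BackEndCache:
    """Hypothetical back-end cache that pushes frequently hit content to front-ends."""

    def __init__(self, front_ends: List[FrontEndCache], hot_threshold: int = 3) -> None:
        self.store: Dict[str, str] = {}
        self.hits: Counter = Counter()
        self.front_ends = front_ends
        self.hot_threshold = hot_threshold       # assumed measure of a "high hit rate"
        self.pushed: Set[str] = set()

    def get(self, key: str, fetch_from_origin: Callable[[str], str]) -> str:
        if key in self.store:
            self.hits[key] += 1                  # served from the back-end cache
        else:
            self.store[key] = fetch_from_origin(key)
        if self.hits[key] >= self.hot_threshold and key not in self.pushed:
            # Propagate hot content so later requests are filled at the front-ends.
            for front_end in self.front_ends:
                front_end.put(key, self.store[key])
            self.pushed.add(key)
        return self.store[key]
```

In this sketch the content is pushed to all front-ends; selecting a subset of front-ends, as the text also contemplates, would replace the loop over `self.front_ends` with a filtered list.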

[0035] FIG. 1 shows an exemplary computing environment 100 in which the present invention may be implemented. Environment 100 includes a plurality of local networks such as Ethernet network 102, FDDI network 103 and Token Ring network 104. Essentially, a number of computing devices and groups of devices are interconnected through a network 101. For example, local networks 102, 103 and 104 are each coupled to network 101 through routers 109. LANs 102, 103 and 104 may be implemented using any available topology and may implement one or more server technologies including, for example, UNIX, Novell, or Windows NT networks, or peer-to-peer type networks. Each network will include distributed storage implemented in each device and typically includes some mass storage device coupled to or managed by a server computer. Network 101 comprises, for example, a public network such as the Internet or another network mechanism such as a fibre channel fabric or conventional WAN technologies.

[0036] Local networks 102, 103 and 104 include one or more network appliances 107. One or more network appliances 107 may be configured as an application and/or file server. Each local network 102, 103 and 104 may include a number of shared devices (not shown) such as printers, file servers, mass storage and the like. Similarly, devices 111 may be shared through network 101 to provide application and file services, directory services, printing, storage, and the like. Routers 109 provide a physical connection between the various devices through network 101. Routers 109 may implement desired access and security protocols to manage access through network 101.

[0037] Network appliances 107 may also couple to network 101 through public switched telephone network 108 using copper or wireless connection technology. In a typical environment, an Internet service provider 106 supports a connection to network 101 as well as PSTN 108 connections to network appliances 107.

[0038] Network appliances 107 may be implemented as any kind of network appliance having sufficient computational function to execute software needed to establish and use a connection to network 101. Network appliances 107 may comprise workstation and personal computer hardware executing commercial operating systems such as Unix variants, Microsoft Windows, Macintosh OS, and the like. At the same time, some appliances 107 comprise portable or handheld devices, such as personal digital assistants and cell phones, that use wireless connections through a wireless access provider and execute operating system software such as PalmOS, WindowsCE, and the like. Moreover, the present invention is readily extended to network devices such as office equipment, vehicles, and personal communicators that make occasional connection through network 101.

[0039] Each of the devices shown in FIG. 1 may include memory, mass storage, and a degree of data processing capability sufficient to manage their connection to network 101. The computer program devices in accordance with the present invention are implemented in the memory of the various devices shown in FIG. 1 and enabled by the data processing capability of the devices shown in FIG. 1. In addition to local memory and storage associated with each device, it is often desirable to provide one or more locations of shared storage such as a disk farm (not shown) that provides mass storage capacity beyond what an individual device can efficiently use and manage. Selected components of the present invention may be stored in or implemented in shared mass storage.

[0040] One feature of the present invention is that front-end servers 201 (shown in FIG. 2B) and/or intermediate servers 206 (shown in FIG. 2A) are implemented as an interchangeable pool of servers, any one of which may be dynamically configured to receive request/response traffic of particular clients 205 and servers 210-212. The embodiments of FIG. 2A and FIG. 2B are not strictly alternative as they may coexist in a network environment. A redirection mechanism, shown in FIG. 3, is enabled to select from an available pool of front-end servers 201 and intermediate servers 206 and direct client request packets from the originating web server to a selected front-end server 201 or intermediary server 206.

[0041] In the case of web-based environments, front-end 201 is implemented using custom or off-the-shelf web server software. Front-end 201 is readily extended to support other, non-web-based protocols, however, and may support multiple protocols for varieties of client traffic. Front-end 201 processes the data traffic it receives, regardless of the protocol of that traffic, to a form suitable for transport by TMP 202 to a back-end 203. Hence, most of the functionality implemented by front-end 201 is independent of the protocol or format of the data received from a client 205. Hence, although the discussion of the exemplary embodiments herein relates primarily to front-end 201 implemented as a web server, it should be noted that, unless specified to the contrary, web-based traffic management and protocols are merely examples and not a limitation of the present invention.

[0042] In the embodiment of FIG. 2A, intermediary servers 206 interact directly with server(s) 210-212. In the embodiment of FIG. 2B, intermediary server 206 is implemented as front-end computer 201 and a back-end computer 203. Front-end server 201 establishes and maintains an enhanced communication channel with a back-end server 203. In either embodiment, intermediary server 206, front-end 201 and/or back-end 203 operate to cache response traffic flowing between a server 210-212 and a client 205.

[0043] In the specific examples herein client 205 comprises a network-enabled graphical user interface such as a World Wide Web (“web”) browser. However, the present invention is readily extended to client software other than conventional World Wide Web browser software. Any client application that can access a standard or proprietary user-level protocol for network access is a suitable equivalent. Examples include client applications that act as front ends for file transfer protocol (FTP) services, voice over Internet protocol (VoIP) services, network news transfer protocol (NNTP) services, multi-purpose internet mail extensions (MIME) services, post office protocol (POP) services, simple mail transfer protocol (SMTP) services, as well as Telnet services. In addition to network protocols, the client application may serve as a front end to a database management system (DBMS) in which case the client application generates query language (e.g., structured query language or “SQL”) messages. In wireless appliances, a client application functions as a front-end to a wireless application protocol (WAP) service.

[0044] FIG. 2B illustrates an embodiment in which intermediary server 206 is implemented by cooperative action of a front-end computer 201 and a back-end computer 203. Caching processes are performed by front-end 201, back-end 203, or both. Front-end mechanism 201 serves as an access point for client-side communications. In one example, front-end 201 comprises a computer that sits “close” to clients 205. By “close”, “topologically close” and “logically close” it is meant that the average latency associated with a connection between a client 205 and a front-end 201 is less than the average latency associated with a connection between a client 205 and servers 210-212. Desirably, front-end computers have as fast a connection as possible to the clients 205. For example, the fastest available connection may be implemented in a point of presence (POP) of an Internet service provider (ISP) 106 used by a particular client 205. However, the placement of the front-ends 201 can limit the number of browsers that can use them. Because of this, in some applications it may be more practical to place one front-end computer in such a way that several POPs can connect to it. Greater distance between front-end 201 and clients 205 may be desirable in some applications as this distance will allow for selection amongst a greater number of front-ends 201 and thereby provide significantly different routes to a particular back-end 203. This may offer benefits when particular routes and/or front-ends become congested or otherwise unavailable.
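
The latency-based definition of “close” given above lends itself to a small worked sketch. The following assumes that latency samples in milliseconds are available for each candidate front-end and for the origin server; the function names `is_close` and `closest_front_end` and the sample data layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional


def is_close(front_end_samples_ms: List[float], origin_samples_ms: List[float]) -> bool:
    """A front-end is 'close' if its average latency is below that of the origin server."""
    return mean(front_end_samples_ms) < mean(origin_samples_ms)


def closest_front_end(samples_by_front_end: Dict[str, List[float]],
                      origin_samples_ms: List[float]) -> Optional[str]:
    """Return the close front-end with the lowest average latency, or None if none is close."""
    averages = {
        name: mean(samples)
        for name, samples in samples_by_front_end.items()
        if samples and is_close(samples, origin_samples_ms)
    }
    if not averages:
        return None
    return min(averages, key=averages.get)
```

Choosing the single lowest-latency front-end is only one policy; as the text notes, a more distant front-end may be preferred when it offers an alternative route around congestion.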

[0045] Transport mechanism 202 is implemented by cooperative actions of the front-end 201 and back-end 203. Back-end 203 processes and directs data communication to and from server(s) 210-212. Transport mechanism 202 communicates data packets using a proprietary protocol over the Internet infrastructure in the particular example. Hence, the present invention does not require heavy infrastructure investments and automatically benefits from improvements implemented in the general-purpose network 101. Unlike the general-purpose Internet, front-end 201 and back-end 203 are programmably assigned to serve accesses to a particular server 210-212 at any given time.

[0046] It is contemplated that any number of front-end and back-end mechanisms may be implemented cooperatively to support the desired level of service required by the data server owner. The present invention implements a many-to-many mapping of front-ends to back-ends. Because the front-end to back-end mappings can be dynamically changed, a fixed hardware infrastructure can be logically reconfigured to map more or fewer front-ends to more or fewer back-ends and web sites or servers as needed.

[0047] A particular advantage of the architectures shown in FIG. 2A and FIG. 2B is that they are readily scaled. In accordance with the present invention, not only can the data itself be distributed, but the functionality and behavior required to implement dynamic content (e.g., dynamic web pages) is readily and dynamically ported to any of a number of intermediary computers 206 and/or front-ends 201 and/or back-ends 203. In this manner, any number of client machines 205 may be supported. To avoid congestion, additional front-ends 201 and/or intermediary servers 206 may be implemented or assigned to particular servers 210-212. Each front-end 201 and/or intermediary server 206 is dynamically re-configurable by updating address parameters to serve particular web sites. Client traffic is dynamically directed to available front-ends 201 to provide load balancing.

[0048] In the examples, dynamic configuration is implemented by a front-end manager component 207 (shown only in FIG. 2B) that communicates with multiple front-ends 201 and/or intermediary servers 206 to provide administrative and configuration information to front-ends 201. Each front-end 201 includes data structures for storing the configuration information, including information identifying the IP addresses of servers 210-212 to which they are currently assigned. Other administrative and configuration information stored in front-end 201 and/or intermediary servers 206 may include information for prioritizing data from and to particular clients, quality of service information, and the like.

[0049] Similarly, additional back-ends 203 can be assigned to a web site to handle increased traffic. Back-end manager component 209 couples to one or more back-ends 203 to provide centralized administration and configuration service. Back-ends 203 include data structures to hold current configuration state, quality of service information and the like. In the particular examples front-end manager 207 and back-end manager 209 serve multiple servers 210-212 and so are able to manipulate the number of front-ends and back-ends assigned to a particular server (e.g., server 210) by updating this configuration information. When the congestion for the server 210 subsides, the front-end 201, back-end 203, and/or intermediary server 206 may be reassigned to other, busier servers. These and similar modifications are equivalent to the specific examples illustrated herein.

[0050] In order for a client 205 to obtain service from a front-end 201 or intermediate server 206, it must first be directed to a front-end 201 or intermediate server 206. Preferably, client 205 initiates all transactions as if it were contacting the originating server 210-212. FIG. 3 illustrates a domain name server (DNS) redirection mechanism that shows how a client 205 is connected to a front-end 201. The DNS system is defined in a variety of Internet Engineering Task Force (IETF) documents such as RFC 0883, RFC 1034 and RFC 1035, which are incorporated by reference herein. In a typical environment, a client 205 executes a browser 301, TCP/IP stack 303, and a resolver 305. For reasons of performance and packaging, browser 301, TCP/IP stack 303 and resolver 305 are often grouped together as routines within a single software product.

[0051] Browser 301 functions as a graphical user interface to implement user input/output (I/O) through monitor 311 and associated keyboard, mouse, or other user input device (not shown). Browser 301 is usually used as an interface for web-based applications, but may also be used as an interface for other applications such as email and network news, as well as special-purpose applications such as database access, telephony, and the like. Alternatively, a special-purpose user interface may be substituted for the more general-purpose browser 301 to handle a particular application.

[0052] TCP/IP stack 303 communicates with browser 301 to convert data between formats suitable for browser 301 and IP format suitable for Internet traffic. TCP/IP stack 303 also implements a TCP protocol that manages transmission of packets between client 205 and an Internet service provider (ISP) or equivalent access point. IP protocol requires that each data packet include, among other things, an IP address identifying a destination node. In current implementations the IP address comprises a 32-bit value that identifies a particular Internet node. Non-IP networks have similar node addressing mechanisms. To provide a more user-friendly addressing system, the Internet implements a system of domain name servers that map alpha-numeric domain names to specific IP addresses. This system provides a name space that serves as a more consistent reference between nodes on the Internet and avoids the need for users to know network identifiers, addresses, routes and similar information in order to make a connection.

[0053] The domain name service is implemented as a distributed database managed by domain name servers (DNSs) 307 such as DNS_A, DNS_B and DNS_C shown in FIG. 3. Each DNS relies on <domain name:IP> address mapping data stored in master files scattered through the hosts that use the domain system. These master files are updated by local system administrators. Master files typically comprise text files that are read by a local name server, and hence become available through the name servers 307 to users of the domain system.

[0054] The user programs (e.g., clients 205) access name servers through standard programs such as resolver 305. Resolver 305 includes an address of a DNS 307 that serves as a primary name server. When presented with a reference to a domain name for a data server 210-212, resolver 305 sends a request to the primary DNS (e.g., DNS_A in FIG. 3). The primary DNS 307 returns either the IP address mapped to that domain name, a reference to another DNS 307 which has the mapping information (e.g., DNS_B in FIG. 3), or a partial IP address together with a reference to another DNS that has more IP address information. Any number of DNS-to-DNS references may be required to completely determine the IP address mapping.
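
The chain of DNS-to-DNS references can be modeled, purely for illustration, with in-memory tables rather than real network queries. In the sketch below each name server is a dictionary whose answer for a domain is either an address or a referral to another server; the `resolve` function, the answer encoding, and the `max_referrals` limit are assumptions, and the partial-address case mentioned above is omitted.

```python
from typing import Dict, Tuple

# An answer is ("address", ip) or ("referral", name_of_another_server).
Answer = Tuple[str, str]


def resolve(domain: str,
            primary: str,
            servers: Dict[str, Dict[str, Answer]],
            max_referrals: int = 8) -> str:
    """Follow referrals from the primary name server until an address is returned."""
    current = primary
    for _ in range(max_referrals + 1):
        table = servers.get(current, {})
        if domain not in table:
            raise LookupError(f"{current} has no entry for {domain}")
        kind, value = table[domain]
        if kind == "address":
            return value
        current = value                      # referral: ask the named server next
    raise RuntimeError(f"too many referrals while resolving {domain}")
```

For example, with a table in which `DNS_A` refers `www.example.com` to `DNS_B` and `DNS_B` holds the address, a call starting at `DNS_A` follows one referral and returns the address held by `DNS_B`.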

[0055] In this manner, the resolver 305 becomes aware of the IP address mapping which is supplied to TCP/IP component 303. Client 205 may cache the IP address mapping for future use. TCP/IP component 303 uses the mapping to supply the correct IP address in packets directed to a particular domain name so that reference to the DNS system need only occur once.

[0056] In accordance with the present invention, at least one DNS server 307 is owned and controlled by system components of the present invention. When a user accesses a network resource (e.g., a database), browser 301 contacts the public DNS system to resolve the requested domain name into its related IP address in a conventional manner. In a first embodiment, the public DNS performs a conventional DNS resolution directing the browser to an originating server 210-212 and server 210-212 performs a redirection of the browser to the system-owned DNS server (i.e., DNS_C in FIG. 3). In a second embodiment, domain:address mappings within the DNS system are modified such that resolution of the originating server's domain automatically returns the address of the system-owned DNS server (DNS_C). Once a browser is redirected to the system-owned DNS server, it begins a process of further redirecting the browser 301 to the best available front-end 201.

[0057] Unlike a conventional DNS server, however, the system-owned DNS_C in FIG. 3 receives domain:address mapping information from a redirector component 309. Redirector 309 is in communication with front-end manager 207 and back-end manager 209 to obtain information on current front-end and back-end assignments to a particular server 210. A conventional DNS is intended to be updated infrequently by reference to its associated master file. In contrast, the master file associated with DNS_C is dynamically updated by redirector 309 to reflect current assignment of front-end 201 and back-end 203. In operation, a reference to data server 210-212 may result in an IP address returned from DNS_C that points to any selected front-end 201 that is currently assigned to data server 210-212. Likewise, data server 210-212 can identify a currently assigned back-end 203 by direct or indirect reference to DNS_C.

[0058] Despite the efficiency of the mechanisms shown in FIG. 3, redirection does take some time and it is preferable to send subsequent requests for a particular server 210-212 directly to an assigned front-end 201 or intermediary server 206 without redirection. When a web page includes links with absolute references the browser 301 may attempt DNS resolution each time a link is followed. To prevent this, one embodiment of the present invention rewrites these links as a part of its reformatting process. In this manner, even though the server contains only a page with absolute references, the page delivered to a client contains relative references.
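
One possible form of such link rewriting is sketched below for illustration. The function name rewrite_links, the regular expression, and the restriction to href and src attributes are assumptions of this sketch and are not taken from the described embodiment. A production implementation would more likely use an HTML parser than a regular expression.

```python
import re
from urllib.parse import urlsplit

# Matches href="http://..." or src='https://...' (illustrative only).
_LINK = re.compile(r'(href|src)=(["\'])(https?://[^"\']+)\2', re.IGNORECASE)


def rewrite_links(html, origin_host):
    """Rewrite absolute links that point at origin_host as host-relative links."""

    def to_relative(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        parts = urlsplit(url)
        if parts.netloc.lower() != origin_host.lower():
            return match.group(0)  # link to some other site: leave untouched
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        if parts.fragment:
            path += "#" + parts.fragment
        return f"{attr}={quote}{path}{quote}"

    return _LINK.sub(to_relative, html)
```

Because the rewritten references carry no host name, a browser following them stays with the front-end that delivered the page.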

[0059] FIG. 4A illustrates a first embodiment in which a single intermediary computer 206 is used, whereas FIG. 4B and FIG. 5 illustrate a second embodiment where both front-end 201 and back-end 203 are used to implement the intermediary server 206. In the embodiment of FIG. 4A, the intermediary server 206 may be located topologically near the client 205 or data server 210-212; either alternative provides some advantage and the choice of location is made to meet the needs of a particular application. Like identified components are substantially equivalent in FIG. 4A, FIG. 4B and FIG. 5 and for ease of understanding are not duplicatively described herein. Also, the components shown in FIG. 4A and FIG. 4B are optimized for web-based applications. Appropriate changes to the components and protocols are made to adapt the specific examples to other protocols and data types.

[0060] Requests from client 205 are received by a TCP unit 401. TCP component 401 includes devices for implementing physical connection layer and Internet protocol (IP) layer functionality. Current IP standards are described in IETF documents RFC0791, RFC0950, RFC0919, RFC0922, RFC0792 and RFC1112 that are incorporated by reference herein. For ease of description and understanding, these mechanisms are not described in great detail herein. Where protocols other than TCP/IP are used to couple to a client 205, TCP component 401 is replaced or augmented with an appropriate network protocol process.

[0061] TCP component 401 communicates TCP packets with one or more clients 205. Preferably, TCP component 401 creates a socket for each request, and returns a received response through the same socket. Received packets are coupled to parser 402 where the Internet protocol (or equivalent) information is extracted. TCP is described in IETF RFC0793 which is incorporated herein by reference. Each TCP packet includes header information that indicates addressing and control variables, and a payload portion that holds the user-level data being transported by the TCP packet. The user-level data in the payload portion typically comprises a user-level network protocol datagram.

[0062] Parser 402 analyzes the payload portion of the TCP packet. In the examples herein, HTTP is employed as the user-level protocol because of its widespread use and the advantage that currently available browser software is able to readily use the HTTP protocol. In this case, parser 402 comprises an HTTP parser. More generally, parser 402 can be implemented as any parser-type logic implemented in hardware or software for interpreting the contents of the payload portion. Parser 402 may implement file transfer protocol (FTP), mail protocols such as simple mail transfer protocol (SMTP), structured query language (SQL), and the like. Any user-level protocol, including proprietary protocols, may be implemented within the present invention using appropriate modification of parser 402.

[0063] In accordance with the present invention, intermediary 206 and front-end 201 include a caching mechanism 403. Cache 403 may be implemented as a passive cache that stores frequently and/or recently accessed content such as pages or page elements. Cache 403 can also be implemented as an active cache that stores content according to specified rules including pages and/or page elements that have not been requested but that are anticipated to be accessed.

[0064] Upon receipt of a TCP packet, HTTP parser 402 determines if the packet is making a request for content within cache 403. If the request can be satisfied from cache 403, the data is supplied directly without reference to server 210-212 (i.e., a cache hit). Cache 403 implements any of a range of management functions for maintaining fresh content. For example, cache 403 may invalidate portions of the cached content after an expiration period specified with the cached data or by server 210-212. Also, cache 403 may proactively update expired cache contents before a request is received for particularly important or frequently used data from data server 210-212 by requesting updated files from the back-end 203 and/or web server 210-212. Further, cache 403 may be proactively served with updated file(s) by back-end 203 and/or web server 210-212. Cache 403 evicts information using any desired algorithm such as least recently used, least frequently used, first in/first out, or random eviction. When the requested data is not within cache 403, a request is passed to server 210-212, and the returned data may be stored in cache 403. Some requests must be supplied to server 210-212 (e.g., customer credit information, form data and the like).
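
The hit, miss and eviction behavior described above can be sketched, for illustration only, using least recently used eviction, which is one of the listed alternatives. The class name LRUCache, its capacity parameter, and the fetch_from_origin callable are hypothetical. The sketch does not handle expiration or requests that must always be passed to the server.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal sketch of a passive cache with least-recently-used eviction."""

    def __init__(self, capacity, fetch_from_origin):
        self._capacity = capacity
        self._fetch = fetch_from_origin   # stands in for a request to server 210-212
        self._items = OrderedDict()       # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def __contains__(self, url):
        return url in self._items

    def get(self, url):
        if url in self._items:
            self._items.move_to_end(url)  # mark as most recently used
            self.hits += 1
            return self._items[url]       # cache hit: no reference to the server
        self.misses += 1
        body = self._fetch(url)           # cache miss: pass the request on
        self._items[url] = body
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
        return body
```

With a capacity of two, requesting /a, /b, /a and then /c evicts /b, because /a was touched more recently.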

[0065] Although the present invention may use conventional cache instructions provided in HTTP headers, preferably the content includes caching parameters that provide independent instruction to the caching system of the present invention. In a particular example, each content element of a web site is associated with a data structure that specifies a cache expiration date value and a cache update interval value. The expiration date value indicates a specific date that the associated content element will expire. The date can be represented, for example, in seconds since Jan. 1, 1970. This member will be zero by default. The update interval value indicates a time interval before the associated content will expire. The interval is expressed in seconds and has a zero value by default in the particular example.
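
A minimal sketch of such a data structure follows. The names CacheParams, expiration_date, update_interval and is_expired are hypothetical. Treating a zero value as "not set" is an assumption of this sketch, since the text states only that zero is the default.

```python
from dataclasses import dataclass


@dataclass
class CacheParams:
    """Per-element caching parameters, independent of any HTTP header."""
    expiration_date: int = 0   # seconds since Jan. 1, 1970; zero by default
    update_interval: int = 0   # seconds; zero by default


def is_expired(params, cached_at, now):
    """Return True once either the expiration date or the interval has passed.

    Assumption: a zero value means the corresponding limit is not set.
    """
    if params.expiration_date and now >= params.expiration_date:
        return True
    if params.update_interval and now - cached_at >= params.update_interval:
        return True
    return False
```

For example, an element cached at time 100 with an update interval of 60 seconds is treated as expired from time 160 onward.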

[0066] Because the invention provides for passing cache instructions that are independent of any instructions in the HTTP header, the caches of the present invention may provide caching services even where the content is marked uncacheable in the HTTP header, or where the cache interval or expiration values associated with a particular piece of content differ from that specified in the HTTP header. Moreover, the present invention is readily extensible to non-HTTP protocols and languages that do not explicitly provide for any cache behavior such as, for example, FTP, structured query language (SQL), mail protocols and the like.

[0067] In response to a cached content element passing its expiration date or expiration interval, the content may either be removed from the cache or updated by reference to an origin server or other cache that has fresher content. This update can be scheduled immediately or delayed to a more convenient time (e.g., when traffic volume is low).

[0068] Preferably, the cache contents are determinable by the site owner by allowing the site owner to cause the cache to be loaded with desired content. Several mechanisms are available to implement this functionality. In one alternative, front-end manager can explicitly load content into cache 403 as desired by a site owner. In another alternative, the cache instructions passed with a response packet may indicate subsequent material to be added to cache 403. For example, when an HTML page is loaded the cache instructions may indicate that all linked resources referenced in the HTML page should be loaded into cache 403. In a more aggressive example, when any one page of a web site 210-212 is loaded, the entire web site is transferred to cache 403 while the user views the first accessed web page. In yet another example, cache contents are propagated by a back-end 203 to connected front-ends 201 based upon propagation rules stored and implemented by back-end 203. For example, when the load on a back-end 203 is high and/or the resources of a back-end 203 become limited, the back-end can more aggressively or speculatively move content out to front-ends 201 to prevent hits. These cache functions may occur as a result of explicit cache instructions, or may be automatically performed by front-end 201 and/or intermediary 206 based on rules (e.g., site owner specified rules) stored in the server.

[0069] Caching can be performed on a file basis, block basis, or any other unit of data regardless of its size. Irrespective of the size of the data unit in cache, cache content is kept current by, for example, causing the front-end 201 or intermediary server 206 to check cache contents against a standard, such as a version of the data unit that is maintained in origin server 210. When a difference between the cached data unit and the standard is detected, the cached unit may be updated by copying the changed data unit, or changed portion of the data unit, from the standard to cache 403. By updating only changed material, rather than updating based upon fixed or static expiration times, data transfers may be lessened in frequency and quantity, especially for slow changing data.
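
Updating only the changed portion of a data unit can be sketched as follows, assuming for illustration that the cached unit and the standard are compared block by block using digests. The block size, the use of SHA-256, and the function names are assumptions of this sketch. The text does not specify how a difference is detected.

```python
import hashlib


def block_digests(data, block_size):
    """Digest of each fixed-size block of data (the last block may be shorter)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def refresh(cached, standard, block_size=4):
    """Bring cached up to date with standard, copying only the changed blocks.

    Returns the updated bytes and the number of blocks that had to be copied.
    """
    old = block_digests(cached, block_size)
    new = block_digests(standard, block_size)
    blocks, copied = [], 0
    for index, digest in enumerate(new):
        start = index * block_size
        if index < len(old) and old[index] == digest:
            blocks.append(cached[start:start + block_size])    # unchanged: keep
        else:
            blocks.append(standard[start:start + block_size])  # changed: copy
            copied += 1
    return b"".join(blocks), copied
```

In a deployed system the digests of the standard would be obtained from the origin server or back-end, so that only the changed blocks cross the network.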

[0070] Although back-end cache 503 is discussed below with reference to FIG. 5, it is important to note that any of the cache loading operations described above for cache 403 may be implemented using back-end cache 503. For example, when a user accesses a first page of a web site, some or all of the web site may be loaded into back-end cache 503 rather than cache 403. As front-end cache 403 will typically contain content from multiple web sites, it may be burdensome to load cache 403 immediately. Instead, content can be pushed out from cache 503 to cache 403 over time as the user continues to access the web site. In this manner, a web site owner can explicitly prevent hits to the central server(s) by proactively sending content to the various levels of cache. Similarly, either cache may proactively or speculatively fill its cache by generating requests to server 210.

[0071] The default home page or “index.html” page of the web site (i.e., the page most likely to be the first contact with a user) may be cached on the intermediary permanently or until the web site provides an updated index page. A hit to the index page will result in serving the index page from the cache, and a request from the intermediary server to load all or part of the site into the cache 403 while the user views the index.html page.

[0072] In the case of intermediary server 206, a request that cannot be satisfied from cache 403 is generally passed to transport component 409 for communication to server 210-212 over channel 411. An alternative to this operation is to determine from cache 403 whether an overlapping request is already pending at the server 210-212 for the same content. An overlapping request refers to a request that will generate a response having at least some of the same content elements as a request currently being served. In this circumstance, subsequently arriving requests can be queued or buffered until the first request returns from the server 210-212 and cache 403 is loaded, as the original request is likely to be responded to prior to any subsequent requests for the same information. This may offer valuable performance improvements in situations where, in a particular example, a large file, such as a multimedia file, being accessed substantially simultaneously by a number of users would only be requested once from the back-end 203 or server 210. By “substantially simultaneously” it is meant that many requests for the same file or resource are received prior to receipt of the response from the web server 210 or back-end 203 to the original request.
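
For illustration, the holding of later requests behind a pending request can be sketched with threads as shown below. The class name CoalescingCache and the fetch callable are hypothetical. As a simplification, the sketch treats two requests as overlapping only when they name the same URL, which is narrower than the definition given above.

```python
import threading


class CoalescingCache:
    """Sketch: requests for a URL that is already being fetched wait for that fetch."""

    def __init__(self, fetch):
        self._fetch = fetch          # stands in for a request to server 210-212
        self._lock = threading.Lock()
        self._cache = {}
        self._pending = {}           # url -> Event set when the first request returns

    def get(self, url):
        with self._lock:
            if url in self._cache:
                return self._cache[url]
            event = self._pending.get(url)
            first = event is None
            if first:
                event = threading.Event()
                self._pending[url] = event
        if first:
            try:
                body = self._fetch(url)      # only the first request reaches the server
                with self._lock:
                    self._cache[url] = body
            finally:
                with self._lock:
                    del self._pending[url]
                event.set()                  # release the queued requests
            return body
        event.wait()                         # queued behind the pending request
        with self._lock:
            if url in self._cache:
                return self._cache[url]
        return self.get(url)                 # first request failed: try again
```

If several clients request the same large file before the first response arrives, the fetch callable in this sketch is invoked once and every client receives the same body.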

[0073] In an optional implementation, caches 403 and/or 503 include an Internet cache protocol (ICP) port for connection to each other and external cache mechanisms. ICP is described in IETF RFC 2186 as a protocol and message format used for communicating between web caches to locate specific objects in neighboring caches. Essentially, one cache sends an ICP query to a set of caches within neighboring front-ends 201, back-ends 203 and/or intermediary servers 206. The neighboring caches 403 and 503 respond back with a “HIT” or “MISS” message indicating whether the requested content exists in the cache.
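
The query and response exchange can be modeled, for illustration only, as shown below. This sketch does not implement the ICP message format of RFC 2186 or any network transport. The neighbor names and the functions icp_query and first_hit are hypothetical.

```python
def icp_query(url, neighbors):
    """Ask every neighboring cache whether it holds url.

    neighbors maps a neighbor name to the collection of URLs it has cached.
    Returns a mapping of neighbor name to "HIT" or "MISS".
    """
    return {
        name: ("HIT" if url in cached_urls else "MISS")
        for name, cached_urls in neighbors.items()
    }


def first_hit(url, neighbors):
    """Name of the first neighbor answering "HIT", or None if all answer "MISS"."""
    for name, reply in icp_query(url, neighbors).items():
        if reply == "HIT":
            return name
    return None
```

A cache that receives None from first_hit in this sketch would fall back to requesting the content from the back-end or origin server.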

[0074] In a particular embodiment, transport component 409 implements a TCP/IP layer suitable for transport over the Internet or other IP network. Transport component 409 creates a socket connection for each request that corresponds to the socket created in transport component 401. This arrangement enables responses to be matched to requests that generated the responses. Channel 411 is compatible with an interface to server 210-212 which may include Ethernet, Fibre channel, or other available physical and transport layer interfaces.

[0075] In FIG. 4A, server 210-212 returns responses to transport component 409, which supplies the responses to parser 402. Parser 402 implements similar processes with the HTTP response packets as described hereinbefore with respect to request packets.

[0076] HTTP component 406 reassembles the response into a format suitable for use by client 205, which in the particular examples herein comprises a web page transported as an HTTP packet. The HTTP packet is sent to transport component 401 for communication to client 205 on the socket opened when the corresponding request was received. In this manner, from the perspective of client 205, the request has been served by originating server 210-212.

[0077] In the embodiment of FIG. 4B and FIG. 5, intermediary server 206 shown in FIG. 2A is implemented by front-end computer 201 and back-end computer 203. A front-end computer 201 refers to a computer located at the client side of network 101 whereas a back-end computer 203 refers to a computer located at the server side of network 101. This arrangement enables caching to be performed at either or both computers. Hence, in addition to caching data to serve needs of clients 205 and servers 210-212, data can be cached to help regulate transport across the communication link 202 coupling front-end 201 and back-end 203. For example, back-end 203 can function as an active “look-ahead” cache to fill cache 503 with content from a server 210-212 aggressively according to criteria specified by the site owner, whereas front-end 201 can cache more passively by caching only responses to actual requests. Overall performance is improved while controlling the load on TMP connection 202 and on server 210-212.

[0078] Optionally, front-end 201, back-end 203, and intermediary computer 206 implement security processes, compression processes, encryption processes and the like to condition the received data for improved transport performance and/or provide additional functionality. These processes may be implemented within any of the functional components shown in FIG. 4A, FIG. 4B and FIG. 5 or implemented as separate functional components within front-end 201, back-end 203 or intermediary 206.

[0079] In the embodiment of FIG. 4B and FIG. 5, the front-end 201 and back-end 203 are coupled by an enhanced communication channel (e.g., TMP link 202). Blenders 404 and 504 slice and/or coalesce the data portions of the received packets into more desirable “TMP units” that are sized for transport through the TMP mechanism 202. The data portion of TCP packets may range in size depending on client 205 and any intervening links coupling client 205 to TCP component 401. Moreover, where compression or other reformatting is applied, the data will vary in size depending on the reformatting processes. Data blender 404 receives information from front-end manager 207 that enables selection of a preferable TMP packet size. Alternatively, a fixed TMP packet size can be set that yields desirable performance across TMP mechanism 202. Data blenders 404 and 504 also mark the TMP units so that they can be re-assembled at the receiving end.
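
Slicing a payload into marked units and re-assembling them at the receiving end can be sketched as follows. The TmpUnit fields (stream identifier, sequence number, total count) are a hypothetical marking scheme chosen for illustration. The actual TMP unit format is not given in this description.

```python
from typing import NamedTuple


class TmpUnit(NamedTuple):
    stream_id: int   # which request or response the data belongs to
    seq: int         # position of this unit within the stream
    total: int       # number of units the stream was sliced into
    data: bytes


def slice_payload(stream_id, payload, unit_size):
    """Slice one payload into units of at most unit_size bytes, marking each."""
    total = max(1, (len(payload) + unit_size - 1) // unit_size)
    return [
        TmpUnit(stream_id, seq, total, payload[seq * unit_size:(seq + 1) * unit_size])
        for seq in range(total)
    ]


def reassemble(units):
    """Rebuild every stream whose units have all arrived, in any order."""
    by_stream = {}
    for unit in units:
        by_stream.setdefault(unit.stream_id, {})[unit.seq] = unit
    complete = {}
    for stream_id, parts in by_stream.items():
        total = next(iter(parts.values())).total
        if len(parts) == total:
            complete[stream_id] = b"".join(parts[i].data for i in range(total))
    return complete
```

Because each unit carries its own marking, units from different clients can be interleaved on the link and still be sorted back into their original payloads.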

[0080] Data blender 404 also serves as a buffer for storing packets from all clients 205 that are associated with front-end 201. Similarly, data blender 504 buffers response packets destined for all the clients 205. Blender 404 mixes data packets coming into front-end 201 into a cohesive stream of TMP packets sent to back-end 203 over TMP link 202. In creating a TMP packet, blender 404 is able to pick and choose amongst the available requests so as to prioritize some requests over others. Prioritization is effected by selectively transmitting request and response data from multiple sources in an order determined by a priority value associated with the particular request and response. For purposes of the present invention, any algorithm or criteria may be used to assign a priority.
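
One way a blender could choose among buffered requests is sketched below using a priority queue. The class name Blender, the convention that a larger number means higher priority, and the first-come ordering among equal priorities are assumptions of this sketch. As noted above, any algorithm or criteria may be used to assign a priority.

```python
import heapq
import itertools


class Blender:
    """Sketch: buffer request data and emit it in priority order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()   # tie-breaker: earlier arrivals first

    def submit(self, priority, data):
        # heapq is a min-heap, so negate priority to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), data))

    def next_packet(self, max_bytes):
        """Fill one packet with the highest-priority data that fits.

        Items larger than max_bytes are assumed to have been sliced beforehand.
        """
        chosen, size = [], 0
        while self._heap and size + len(self._heap[0][2]) <= max_bytes:
            _, _, data = heapq.heappop(self._heap)
            chosen.append(data)
            size += len(data)
        return b"".join(chosen)
```

In this sketch lower-priority data simply waits in the buffer until a later packet has room for it.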

[0081] TMP mechanisms 405 and 505 implement the transport morphing protocol™ (TMP™) packets used in the system in accordance with the present invention. Transport morphing protocol and TMP are trademarks or registered trademarks of Circadence Corporation in the United States and other countries. TMP is a TCP-like protocol adapted to improve performance for multiple channels operating over a single connection. Front-end TMP mechanism 405 in cooperation with a corresponding back-end TMP mechanism 505 shown in FIG. 5 are computer processes that implement the end points or sockets of TMP link 202. The TMP mechanism in accordance with the present invention creates and maintains a stable connection between two processes for high-speed, reliable, adaptable communication.

[0082] Another feature of TMP is its ability to channel numerous TCP connections through a single TMP pipe 202. The environment in which TMP resides allows multiple TCP connections to occur at one end of the system. These TCP connections are then combined into a single TMP connection. The TMP connection is then broken down at the other end of the TMP pipe 202 in order to traffic the TCP connections to their appropriate destinations. TMP includes mechanisms to ensure that each TMP connection gets enough of the available bandwidth to accommodate the multiple TCP connections that it is carrying.

[0083] An advantage of TMP as compared to traditional protocols is the amount of information about the quality of the connection that a TMP connection conveys from one end to the other of a TMP pipe 202. As often happens in a network environment, each end has a great deal of information about the characteristics of the connection in one direction, but not the other. By knowing about the connection as a whole, TMP can better take advantage of the available bandwidth.

[0084] In a particular example, each TMP packet includes a header portion in which some area is available for transfer of commands between a back-end 203 and a front-end 201. For example, a “Fill_Cache” command can be defined having a parameter specifying a URL having content that is to be loaded into the cache of the recipient of the command. In this manner, a front-end 201 can explicitly cause a back-end to fill its cache with specified content without requiring the content to be forwarded over the TMP link. Similarly, a back-end 203 can send explicit cache fill instructions to a front-end 201. Alternatively, front-end 201 and back-end 203 can communicate cache fill instructions through management components 207 and 209. In yet another alternative, cache fill procedures can be implemented by remote procedure calls (RPCs). For example, a back-end 203 can issue a cache fill RPC to a front-end 201 that when executed on front-end 201 will cause front-end 201 to request the specified content even though no client request has been received. The passive caching mechanism in front-end 201 will then perform the desired cache fill operation on the returned data.
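
The handling of a “Fill_Cache” command by its recipient can be sketched as follows. Only the command name and its URL parameter come from the description above. Representing the header area as a dictionary, the key names, and the functions make_fill_cache and handle_command are assumptions made for illustration.

```python
def make_fill_cache(url):
    """Build the command area of a packet header (hypothetical layout)."""
    return {"command": "Fill_Cache", "url": url}


def handle_command(header, cache, fetch):
    """Act on a command received in a packet header.

    cache is the recipient's own cache (a dict here) and fetch is how the
    recipient obtains content, e.g. from the origin server. Returns True if
    a Fill_Cache command was carried out.
    """
    if header.get("command") != "Fill_Cache":
        return False
    url = header["url"]
    cache[url] = fetch(url)   # the recipient loads the content itself
    return True
```

Only the short command crosses the link in this sketch. The content is obtained by the recipient, which is why it need not be forwarded over the TMP link.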

[0085] Although the invention has been described and illustrated with a certain degree of particularity, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the combination and arrangement of parts can be resorted to by those skilled in the art without departing from the spirit and scope of the invention, as hereinafter claimed. For example, while devices supporting HTTP data traffic are used in the examples, the HTTP devices may be replaced or augmented to support other public and proprietary protocols including FTP, NNTP, SMTP, SQL and the like. In such implementations the front-end 201 and/or back-end 203 are modified to implement the desired protocol. Moreover, front-end 201 and back-end 203 may support different protocols such that the front-end 201 supports, for example, HTTP traffic with a client and the back-end supports a DBMS protocol such as SQL. Such implementations not only provide the advantages of the present invention, but also enable a client to access a rich set of network resources with minimal client software.

We claim:
 1. A system for caching network resources comprising: a server having network resources stored thereon; a client generating requests for the network resources; an intermediary server configured to receive requests from the client and retrieve the network resources from the server; a cache controlled by the intermediary server for caching selected network resources, wherein the cached resources include more than the requested resources and wherein at least some of the cached resources are selected both in response to the request and explicitly selected to prevent future client requests from being communicated to the server.
 2. The system of claim 1 wherein the cache includes only a home page for at least one web site.
 3. The system of claim 1 wherein the intermediary server comprises a front-end computer and a back-end computer.
 4. The system of claim 3 wherein both the front-end computer and the back-end computer implement a cache data structure.
 5. The system of claim 4 further comprising: a first page cached on the front-end computer cache, the first page associated with a plurality of other resources, wherein the other resources are cached on the back-end computer cache.
 6. The system of claim 5 wherein the association is explicit in links within the first page that point to the secondary resources.
 7. The system of claim 5 wherein the association is implicit in user access patterns.
 8. The system of claim 5 wherein the association is explicitly defined by the site owner.
 9. The system of claim 1 wherein the cache is configured to store web pages and elements thereof.
 10. The system of claim 1 wherein the cache is configured to store program constructs comprising software code, applets, scripts, active controls.
 11. The system of claim 1 wherein the cache is configured to store files.
 12. The system of claim 1 further comprising: means within the intermediary server for merging a current request for network resources that are not in the cache with a prior issued pending request for the same network resources.
 13. A cache system comprising: a communication network; a plurality of network-connected intermediary servers each having an interface for receiving client requests for network resources, each intermediary server having a cache associated therewith; communication channels linking each intermediary server with a set of neighboring intermediary servers for exchanging cache contents amongst the intermediary servers.
 14. A method for caching network data comprising: communicating request-response traffic between two or more network-connected computing appliances; implementing a cache coupled to the request-response traffic; and selectively placing data from the request-response traffic into the cache at least partially based upon attributes of the client and/or server associated with the request-response traffic.
 15. The method of claim 14 further comprising: associating client attributes with the request-response traffic, the client attributes associating a relative priority with the traffic, wherein the act of selectively placing is at least partially based upon the client attribute.
 16. The method of claim 14 further comprising: associating client attributes with the request-response traffic, the client attributes associating a service level with the traffic, wherein the act of selectively placing is at least partially based upon the client attribute.
 17. The method of claim 14 further comprising: associating client attributes with the request-response traffic, the client attributes associating a service level with the traffic, wherein the act of selectively placing is at least partially based upon a server assigned priority.
 18. A cache system comprising: a front-end server implementing a first cache and configured to receive client requests and generate responses to the client requests; a back-end server implementing a second cache and configured to receive requests from the front-end server and generate responses to the front-end server; an origin server having content stored thereon; a communication channel linking the front-end server and the back-end server; a cache management mechanism in communication with the front-end computer and the back-end computer to selectively fill the first and second caches.
 19. The cache system of claim 18 wherein the cache management mechanism comprises a process within the front-end server for receiving responses to client requests and placing the received responses in the cache.
 20. The cache system of claim 18 wherein the cache management mechanism comprises a process within the front-end server for generating, sua sponte, requests and placing the responses to the sua sponte requests in the cache.
 21. The cache system of claim 18 wherein the cache management mechanism comprises processes for populating one cache with contents from another cache.
 22. A system for caching network resources comprising: a plurality of intermediary servers configured to receive client requests and retrieve request-specified network resources; a cache implemented within each of the intermediary servers and configured to store selected network resources; a resolver mechanism for supplying a network address of the intermediary server to the client applications, wherein the resolver mechanism dynamically selects a particular intermediary server from amongst the plurality of intermediary servers based at least in part on the content of each intermediary server's cache.
 23. The system of claim 22 further comprising: a redirection mechanism within a first of the intermediary servers configured to redirect a client request from the first intermediary server to a second of the intermediary servers based at least in part on the content of the first and second intermediary server's caches.
 24. A cache system comprising: a first front-end server implementing a first cache and configured to receive client requests and generate responses to the client requests; a second front-end server implementing a second cache and configured to receive client requests and generate responses to the client requests; an origin server having content stored thereon; a communication channel linking the first front-end server and the second front-end server; a cache management mechanism in communication with the first and second front-end computers to selectively fill the second cache in response to a client request received by the first front-end server.
 25. The cache system of claim 24 wherein the cache management mechanism selectively updates the second cache based upon knowledge that subsequent client requests will be directed to the second front-end server.
 26. The cache system of claim 24 wherein the cache management mechanism selectively updates the second cache based upon anticipation that subsequent client requests will be directed to the second front-end server.
 27. A method of speculatively caching Internet content comprising: receiving a current request for specified content; obtaining the specified content in response to the current request; and speculatively caching data in addition to the specified content.
 28. The method of claim 27 wherein the act of speculatively caching data comprises determining data that is likely to be requested subsequent to the current request.
 29. The method of claim 27 wherein the act of speculatively caching data comprises: determining an ability for a server to respond to subsequent requests for the data; and speculatively caching data when it is determined that the server's ability to respond to subsequent requests is less than a preselected level.